Sensitivity Analysis
Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be divided and allocated to different sources of uncertainty in its inputs. A related practice is uncertainty analysis, which has a greater focus on uncertainty quantification and propagation of uncertainty; ideally, uncertainty and sensitivity analysis should be run in tandem.

The process of recalculating outcomes under alternative assumptions to determine the impact of a variable under sensitivity analysis can be useful for a range of purposes, including:
* Testing the robustness of the results of a model or system in the presence of uncertainty.
* Increased understanding of the relationships between input and output variables in a system or model.
* Uncertainty reduction, through the identification of model inputs that cause significant uncertainty in the output and should therefore be the focus of attention in order to increase robustness (perhaps by further research).
* Searching for errors in the model (by encountering unexpected relationships between inputs and outputs).
* Model simplification – fixing model inputs that have no effect on the output, or identifying and removing redundant parts of the model structure.
* Enhancing communication from modelers to decision makers (e.g. by making recommendations more credible, understandable, compelling or persuasive).
* Finding regions in the space of input factors for which the model output is either maximum or minimum or meets some optimum criterion (see optimization and Monte Carlo filtering).
* When calibrating models with a large number of parameters, a preliminary sensitivity test can ease the calibration stage by focusing on the sensitive parameters. Not knowing the sensitivity of parameters can result in time being wasted on non-sensitive ones.
* Identifying important connections between observations, model inputs, and predictions or forecasts, leading to the development of better models.


Overview

A mathematical model (for example in biology, climate change, economics or engineering) can be highly complex, and as a result, its relationships between inputs and outputs may be poorly understood. In such cases, the model can be viewed as a black box, i.e. the output is an "opaque" function of its inputs.

Quite often, some or all of the model inputs are subject to sources of uncertainty, including errors of measurement, absence of information and poor or partial understanding of the driving forces and mechanisms. This uncertainty imposes a limit on our confidence in the response or output of the model. Further, models may have to cope with the natural intrinsic variability of the system (aleatory), such as the occurrence of stochastic events.

Good modeling practice requires that the modeler provide an evaluation of the confidence in the model. This requires, first, a quantification of the uncertainty in any model results (uncertainty analysis); and second, an evaluation of how much each input is contributing to the output uncertainty. Sensitivity analysis addresses the second of these issues (although uncertainty analysis is usually a necessary precursor), performing the role of ordering by importance the strength and relevance of the inputs in determining the variation in the output.

In models involving many input variables, sensitivity analysis is an essential ingredient of model building and quality assurance. National and international agencies involved in impact assessment studies have included sections devoted to sensitivity analysis in their guidelines. Examples are the European Commission (see e.g. the guidelines for impact assessment; European Commission, "Better Regulation Toolbox", 25 November 2021), the White House Office of Management and Budget, the Intergovernmental Panel on Climate Change and the US Environmental Protection Agency's modeling guidelines.

In a comment published in 2020 in the journal Nature, 22 scholars take COVID-19 as the occasion for suggesting five ways to make models serve society better. One of the five recommendations, under the heading of 'Mind the assumptions', is to 'perform global uncertainty and sensitivity analyses [...] allowing all that is uncertain – variables, mathematical relationships and boundary conditions – to vary simultaneously as runs of the model produce its range of predictions' (A. Saltelli et al., "Five ways to ensure that models serve society: a manifesto", Nature 582 (2020) 482–484).


Settings, constraints, and related issues


Settings and constraints

The choice of method of sensitivity analysis is typically dictated by a number of problem constraints or settings. Some of the most common are:
* Computational expense: Sensitivity analysis is almost always performed by running the model a (possibly large) number of times, i.e. a sampling-based approach. This can be a significant problem when:
** A single run of the model takes a significant amount of time (minutes, hours or longer). This is not unusual with very complex models.
** The model has a large number of uncertain inputs. Sensitivity analysis is essentially the exploration of the multidimensional input space, which grows exponentially in size with the number of inputs. See the curse of dimensionality.
: Computational expense is a problem in many practical sensitivity analyses. Some methods of reducing computational expense include the use of emulators (for large models) and screening methods (for reducing the dimensionality of the problem). Another method is to use an event-based sensitivity analysis method for variable selection for time-constrained applications. This is an input variable selection (IVS) method that assembles information about the trace of the changes in system inputs and outputs using sensitivity analysis to produce an input/output trigger/event matrix designed to map the relationships between input data as causes that trigger events and the output data that describes the actual events. The cause-effect relationship between the causes of state change, i.e. input variables, and the effect system output parameters determines which set of inputs has a genuine impact on a given output. The method has a clear advantage over analytical and computational IVS methods since it tries to understand and interpret system state change in the shortest possible time with minimum computational overhead.
* Correlated inputs: Most common sensitivity analysis methods assume independence between model inputs, but sometimes inputs can be strongly correlated. This is still an immature field of research and definitive methods have yet to be established.
* Nonlinearity: Some sensitivity analysis approaches, such as those based on linear regression, can inaccurately measure sensitivity when the model response is nonlinear with respect to its inputs. In such cases, variance-based measures are more appropriate.
* Model interactions: Interactions occur when the perturbation of two or more inputs ''simultaneously'' causes variation in the output greater than that of varying each of the inputs alone. Such interactions are present in any model that is non-additive, but will be neglected by methods such as scatterplots and one-at-a-time perturbations. The effect of interactions can be measured by the total-order sensitivity index.
* Multiple outputs: Virtually all sensitivity analysis methods consider a single univariate model output, yet many models output a large number of possibly spatially or time-dependent data. Note that this does not preclude the possibility of performing different sensitivity analyses for each output of interest. However, for models in which the outputs are correlated, the sensitivity measures can be hard to interpret.
* Given data: While in many cases the practitioner has access to the model, in some instances a sensitivity analysis must be performed with "given data", i.e. where the sample points (the values of the model inputs for each run) cannot be chosen by the analyst. This may occur when a sensitivity analysis has to be performed retrospectively, perhaps using data from an optimisation or uncertainty analysis, or when data come from a discrete source.


Assumptions vs. inferences

In uncertainty and sensitivity analysis there is a crucial trade-off between how scrupulous an analyst is in exploring the input assumptions and how wide the resulting inference may be. The point is well illustrated by the econometrician Edward E. Leamer:

: I have proposed a form of organized sensitivity analysis that I call 'global sensitivity analysis' in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful.

Note Leamer's emphasis is on the need for 'credibility' in the selection of assumptions. The easiest way to invalidate a model is to demonstrate that it is fragile with respect to the uncertainty in the assumptions, or to show that its assumptions have not been taken 'wide enough'. The same concept is expressed by Jerome R. Ravetz, for whom bad modeling is when ''uncertainties in inputs must be suppressed lest outputs become indeterminate''.


Pitfalls and difficulties

Some common difficulties in sensitivity analysis include:
* Too many model inputs to analyse. Screening can be used to reduce dimensionality. Another way to tackle the curse of dimensionality is to use sampling based on low-discrepancy sequences.
* The model takes too long to run. Emulators (including HDMR) can reduce the total time by accelerating the model or by reducing the number of model runs needed.
* There is not enough information to build probability distributions for the inputs. Probability distributions can be constructed from expert elicitation, although even then it may be hard to build distributions with great confidence. The subjectivity of the probability distributions or ranges will strongly affect the sensitivity analysis.
* Unclear purpose of the analysis. Different statistical tests and measures are applied to the problem and different factor rankings are obtained. The test should instead be tailored to the purpose of the analysis, e.g. one uses Monte Carlo filtering if one is interested in which factors are most responsible for generating high/low values of the output.
* Too many model outputs are considered. This may be acceptable for the quality assurance of sub-models but should be avoided when presenting the results of the overall analysis.
* Piecewise sensitivity. This is when one performs sensitivity analysis on one sub-model at a time. This approach is non-conservative as it might overlook interactions among factors in different sub-models (Type II error).
* The commonly used OAT approach is not valid for nonlinear models. Global methods should be used instead.


Sensitivity analysis methods

There are a large number of approaches to performing a sensitivity analysis, many of which have been developed to address one or more of the constraints discussed above. They are also distinguished by the type of sensitivity measure, be it based on (for example) variance decompositions, partial derivatives or elementary effects. In general, however, most procedures adhere to the following outline:
# Quantify the uncertainty in each input (e.g. ranges, probability distributions). Note that this can be difficult and many methods exist to elicit uncertainty distributions from subjective data.
# Identify the model output to be analysed (the target of interest should ideally have a direct relation to the problem tackled by the model).
# Run the model a number of times using some design of experiments, dictated by the method of choice and the input uncertainty.
# Using the resulting model outputs, calculate the sensitivity measures of interest.
In some cases this procedure will be repeated, for example in high-dimensional problems where the user has to screen out unimportant variables before performing a full sensitivity analysis. The various types of "core methods" (discussed below) are distinguished by the sensitivity measures they calculate; these categories can overlap, and alternative ways of obtaining the measures, under the constraints of the problem, can be given. A minimal end-to-end sketch of this workflow is given below.
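To make the outline above concrete, the following is a minimal sketch of a sampling-based uncertainty/sensitivity workflow in Python. The model, the input ranges and the correlation-based sensitivity measure are all illustrative assumptions; in practice the design of experiments and the sensitivity measure would be chosen according to the constraints discussed above.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Step 1: quantify input uncertainty (here: assumed independent uniform ranges)
bounds = {"x1": (0.0, 1.0), "x2": (0.5, 2.0), "x3": (-1.0, 1.0)}

def model(x1, x2, x3):
    # Step 2: the (hypothetical) model output to be analysed
    return np.sin(x1) + 2.0 * x2 + 0.1 * x3 ** 2

# Step 3: run the model over a simple random design of experiments
n = 5000
X = {name: rng.uniform(lo, hi, n) for name, (lo, hi) in bounds.items()}
Y = model(**X)

# Step 4: compute a (crude) sensitivity measure for each input,
# here the squared Pearson correlation with the output
for name, xi in X.items():
    r = np.corrcoef(xi, Y)[0, 1]
    print(f"{name}: r^2 = {r**2:.3f}")
</syntaxhighlight>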


One-at-a-time (OAT)

One of the simplest and most common approaches is that of changing one factor at a time (OAT), to see what effect this produces on the output. OAT customarily involves:
* moving one input variable, keeping others at their baseline (nominal) values, then
* returning the variable to its nominal value, then repeating for each of the other inputs in the same way.
Sensitivity may then be measured by monitoring changes in the output, e.g. by partial derivatives or linear regression. This appears a logical approach, as any change observed in the output will unambiguously be due to the single variable changed. Furthermore, by changing one variable at a time, one can keep all other variables fixed to their central or baseline values. This increases the comparability of the results (all 'effects' are computed with reference to the same central point in space) and minimizes the chances of computer program crashes, more likely when several input factors are changed simultaneously. OAT is frequently preferred by modelers for practical reasons: in case of model failure under OAT analysis, the modeler immediately knows which input factor is responsible for the failure.

Despite its simplicity, however, this approach does not fully explore the input space, since it does not take into account the simultaneous variation of input variables. This means that the OAT approach cannot detect the presence of interactions between input variables and is unsuitable for nonlinear models. The proportion of input space which remains unexplored with an OAT approach grows superexponentially with the number of inputs. For example, a 3-variable parameter space which is explored one-at-a-time is equivalent to taking points along the x, y, and z axes of a cube centered at the origin. The convex hull bounding all these points is an octahedron which has a volume only 1/6th of the total parameter space. More generally, the convex hull of the axes of a hyperrectangle forms a hyperoctahedron which has a volume fraction of 1/''n''!. With 5 inputs, the explored space already drops to less than 1% of the total parameter space. And even this is an overestimate, since the off-axis volume is not actually being sampled at all. Compare this to random sampling of the space, where the convex hull approaches the entire volume as more points are added. While the sparsity of OAT is theoretically not a concern for linear models, true linearity is rare in nature.
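A minimal sketch of an OAT experiment follows, assuming a hypothetical model and illustrative baseline and perturbation values; each input is moved away from the baseline in turn while the others are held at their nominal values.

<syntaxhighlight lang="python">
import numpy as np

def model(x):
    # hypothetical model: x is a vector of three inputs
    # (note: the x[0]*x[1] interaction is invisible to an OAT design)
    return x[0] ** 2 + x[0] * x[1] + 3.0 * x[2]

baseline = np.array([1.0, 1.0, 1.0])
delta = 0.1  # assumed perturbation size

y0 = model(baseline)
for i in range(len(baseline)):
    x = baseline.copy()
    x[i] += delta              # move one input, keep the others at baseline
    effect = model(x) - y0     # change in output attributable to input i alone
    print(f"input {i}: OAT effect = {effect:+.4f}")
</syntaxhighlight>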


Derivative-based local methods

Local derivative-based methods involve taking the partial derivative of the output ''Y'' with respect to an input factor ''X''''i'':

: \left| \frac{\partial Y}{\partial X_i} \right|_{\mathbf{x}^0},

where the subscript '''x'''^0 indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling and automated differentiation are methods in this class. Similar to OAT, local methods do not attempt to fully explore the input space, since they examine small perturbations, typically one variable at a time. It is possible to select similar samples from derivative-based sensitivity through neural networks and perform uncertainty quantification. One advantage of the local methods is that it is possible to make a matrix to represent all the sensitivities in a system, thus providing an overview that cannot be achieved with global methods if there is a large number of input and output variables (Kabir HD, Khosravi A, Nahavandi D, Nahavandi S, "Uncertainty Quantification Neural Network from Similarity and Sensitivity", 2020 International Joint Conference on Neural Networks (IJCNN), 2020, pp. 1–8, IEEE).
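A simple way to approximate the local derivative-based measure is a finite-difference scheme around the fixed point x^0. The sketch below assembles the sensitivity matrix (outputs × inputs) mentioned above for a hypothetical two-output model; the step size and the model are assumptions for illustration only.

<syntaxhighlight lang="python">
import numpy as np

def model(x):
    # hypothetical model with two outputs and three inputs
    return np.array([x[0] * x[1] + np.exp(x[2]),
                     x[0] ** 2 - 0.5 * x[2]])

x0 = np.array([1.0, 2.0, 0.5])    # fixed point at which derivatives are taken
h = 1e-6                          # finite-difference step (assumed)

y0 = model(x0)
J = np.zeros((y0.size, x0.size))  # local sensitivity (Jacobian) matrix
for i in range(x0.size):
    xp = x0.copy()
    xp[i] += h
    J[:, i] = (model(xp) - y0) / h   # forward-difference approximation of dY/dX_i

print(np.round(J, 4))
</syntaxhighlight>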


Regression analysis

Regression analysis, in the context of sensitivity analysis, involves fitting a linear regression to the model response and using standardized regression coefficients as direct measures of sensitivity. The regression is required to be linear with respect to the data (i.e. a hyperplane, hence with no quadratic terms, etc., as regressors) because otherwise it is difficult to interpret the standardised coefficients. This method is therefore most suitable when the model response is in fact linear; linearity can be confirmed, for instance, if the coefficient of determination is large. The advantages of regression analysis are that it is simple and has a low computational cost.
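The standardized regression coefficients can be obtained by regressing the standardized output on the standardized inputs. The sketch below uses ordinary least squares on a Monte Carlo sample of a hypothetical model; the coefficients approximate sensitivities only to the extent that the model is close to linear, which can be checked via the coefficient of determination.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

def model(X):
    # hypothetical, nearly linear model
    return 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.2 * X[:, 2] ** 2

n, k = 2000, 3
X = rng.uniform(0.0, 1.0, size=(n, k))
Y = model(X)

# standardize inputs and output, then fit an ordinary least-squares hyperplane
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Ys = (Y - Y.mean()) / Y.std()
design = np.column_stack([np.ones(n), Xs])
beta, *_ = np.linalg.lstsq(design, Ys, rcond=None)
src = beta[1:]                                    # standardized regression coefficients
Y_hat = design @ beta
r2 = 1.0 - np.sum((Ys - Y_hat) ** 2) / np.sum(Ys ** 2)

print("SRCs:", np.round(src, 3), " R^2 =", round(r2, 3))
</syntaxhighlight>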


Variance-based methods

Variance-based methods are a class of probabilistic approaches which quantify the input and output uncertainties as probability distributions, and decompose the output variance into parts attributable to input variables and combinations of variables. The sensitivity of the output to an input variable is therefore measured by the amount of variance in the output caused by that input. These can be expressed as conditional expectations, i.e., considering a model ''Y'' = ''f''(''X'') for ''X'' = \{X_1, X_2, \dots, X_k\}, a measure of sensitivity of the ''i''th variable ''X''''i'' is given as:

: \operatorname{Var}_{X_i} \left( E_{X_{\sim i}} \left( Y \mid X_i \right) \right)

where "Var" and "''E''" denote the variance and expected value operators respectively, and ''X''''~i'' denotes the set of all input variables except ''X''''i''. This expression essentially measures the contribution of ''X''''i'' alone to the uncertainty (variance) in ''Y'' (averaged over variations in other variables), and is known as the ''first-order sensitivity index'' or ''main effect index''. Importantly, it does not measure the uncertainty caused by interactions with other variables. A further measure, known as the ''total effect index'', gives the total variance in ''Y'' caused by ''X''''i'' ''and'' its interactions with any of the other input variables. Both quantities are typically standardised by dividing by Var(''Y'').

Variance-based methods allow full exploration of the input space, accounting for interactions and nonlinear responses. For these reasons they are widely used when it is feasible to calculate them. Typically this calculation involves the use of Monte Carlo methods, but since this can involve many thousands of model runs, other methods (such as emulators) can be used to reduce computational expense when necessary. Note that full variance decompositions are only meaningful when the input factors are independent from one another.
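First-order and total-effect indices can be estimated by Monte Carlo using a "pick-freeze" design, in which two independent sample matrices are combined column by column. The sketch below uses commonly cited estimators (a Saltelli-type estimator for the first-order index and a Jansen-type estimator for the total effect) on a hypothetical model with an interaction; it is a plain illustration rather than an optimized implementation.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)

def model(X):
    # hypothetical model with an interaction between the first two inputs
    return X[:, 0] + 2.0 * X[:, 1] + X[:, 0] * X[:, 1] + 0.1 * X[:, 2]

n, k = 20000, 3
A = rng.uniform(0.0, 1.0, size=(n, k))
B = rng.uniform(0.0, 1.0, size=(n, k))
fA, fB = model(A), model(B)
var_y = np.var(np.concatenate([fA, fB]))

for i in range(k):
    ABi = A.copy()
    ABi[:, i] = B[:, i]          # replace column i of A with column i of B
    fABi = model(ABi)
    S_i = np.mean(fB * (fABi - fA)) / var_y            # first-order index
    ST_i = 0.5 * np.mean((fA - fABi) ** 2) / var_y     # total-effect index
    print(f"X{i + 1}: S = {S_i:.3f}, ST = {ST_i:.3f}")
</syntaxhighlight>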


Variogram analysis of response surfaces (''VARS'')

One of the major shortcomings of the previous sensitivity analysis methods is that none of them considers the spatially ordered structure of the response surface/output of the model ''Y'' = ''f''(''X'') in the parameter space. By utilizing the concepts of directional variograms and covariograms, variogram analysis of response surfaces (VARS) addresses this weakness by recognizing a spatially continuous correlation structure in the values of ''Y'', and hence also in the values of \frac{\partial Y}{\partial x_i}. Basically, the higher the variability, the more heterogeneous is the response surface along a particular direction/parameter, at a specific perturbation scale. Accordingly, in the VARS framework, the values of directional variograms for a given perturbation scale can be considered as a comprehensive illustration of sensitivity information, through linking variogram analysis to both direction and perturbation scale concepts. As a result, the VARS framework accounts for the fact that sensitivity is a scale-dependent concept, and thus overcomes the scale issue of traditional sensitivity analysis methods. More importantly, VARS is able to provide relatively stable and statistically robust estimates of parameter sensitivity with much lower computational cost than other strategies (about two orders of magnitude more efficient). Notably, it has been shown that there is a theoretical link between the VARS framework and the variance-based and derivative-based approaches.
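The core quantity in VARS is a directional variogram of the response surface. The sketch below only illustrates that building block: for each parameter it estimates γ_i(h) = ½ E[(f(x + h·e_i) − f(x))²] at a chosen perturbation scale h from random base points. The model and the scale are assumptions, and the full VARS framework (star-based sampling, integrated variogram indices, etc.) is considerably richer than this illustration.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)

def model(x):
    # hypothetical response surface on the unit cube
    return np.sin(4.0 * x[:, 0]) + 0.5 * x[:, 1] ** 2 + 0.1 * x[:, 2]

n, k = 5000, 3
h = 0.1                     # assumed perturbation scale (fraction of parameter range)
X = rng.uniform(0.0, 1.0 - h, size=(n, k))
y = model(X)

for i in range(k):
    Xh = X.copy()
    Xh[:, i] += h           # perturb only parameter i by the scale h
    gamma = 0.5 * np.mean((model(Xh) - y) ** 2)   # directional variogram at lag h
    print(f"parameter {i + 1}: gamma({h}) = {gamma:.4f}")
</syntaxhighlight>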


Screening

Screening is a particular instance of a sampling-based method. The objective here is to identify which input variables are contributing significantly to the output uncertainty in high-dimensionality models, rather than exactly quantifying sensitivity (i.e. in terms of variance). Screening tends to have a relatively low computational cost when compared to other approaches, and can be used in a preliminary analysis to weed out uninfluential variables before applying a more informative analysis to the remaining set. One of the most commonly used screening methods is the elementary effect method.
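A minimal sketch of elementary-effects screening in the spirit of the Morris method: for a number of random base points, each input is perturbed by a fixed step and the normalized output change (the elementary effect) is recorded. The mean of absolute effects (μ*) ranks importance, while the standard deviation (σ) flags nonlinearity or interactions. The model, step size and sample size are illustrative assumptions, and this radial design is a simplification of full Morris trajectories.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(4)

def model(x):
    # hypothetical model on the unit cube
    return x[0] + 5.0 * x[1] ** 2 + x[0] * x[1] + 0.01 * x[2]

k, r = 3, 50            # number of inputs and of random base points
delta = 0.2             # assumed step size
EE = np.zeros((r, k))   # elementary effects

for j in range(r):
    base = rng.uniform(0.0, 1.0 - delta, size=k)
    y0 = model(base)
    for i in range(k):
        x = base.copy()
        x[i] += delta
        EE[j, i] = (model(x) - y0) / delta

mu_star = np.abs(EE).mean(axis=0)   # importance ranking
sigma = EE.std(axis=0)              # nonlinearity / interaction indicator
for i in range(k):
    print(f"X{i + 1}: mu* = {mu_star[i]:.3f}, sigma = {sigma[i]:.3f}")
</syntaxhighlight>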


Scatter plots

A simple but useful tool is to plot scatter plots of the output variable against individual input variables, after (randomly) sampling the model over its input distributions. The advantage of this approach is that it can also deal with "given data", i.e., a set of arbitrarily-placed data points, and gives a direct visual indication of sensitivity. Quantitative measures can also be drawn, for example by measuring the correlation between ''Y'' and ''X''''i'', or even by estimating variance-based measures by nonlinear regression.
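A sketch of the scatter-plot approach, assuming matplotlib is available: the model is sampled randomly over its input distributions, one scatter plot per input is drawn, and the squared correlation with the output is reported as a simple quantitative companion to the visual inspection.

<syntaxhighlight lang="python">
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)

def model(X):
    # hypothetical model
    return np.sin(X[:, 0]) + 0.3 * X[:, 1] ** 2 + 0.05 * X[:, 2]

n, k = 1000, 3
X = rng.uniform(-np.pi, np.pi, size=(n, k))
Y = model(X)

fig, axes = plt.subplots(1, k, figsize=(12, 3), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(X[:, i], Y, s=4, alpha=0.4)
    r = np.corrcoef(X[:, i], Y)[0, 1]
    ax.set_title(f"X{i + 1} (r^2 = {r**2:.2f})")
    ax.set_xlabel(f"X{i + 1}")
axes[0].set_ylabel("Y")
plt.tight_layout()
plt.show()
</syntaxhighlight>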


Alternative methods

A number of methods have been developed to overcome some of the constraints discussed above, which would otherwise make the estimation of sensitivity measures infeasible (most often due to computational expense). Generally, these methods focus on efficiently calculating variance-based measures of sensitivity.


Emulators

Emulators (also known as metamodels, surrogate models or response surfaces) are data-modeling/machine learning approaches that involve building a relatively simple mathematical function, known as an ''emulator'', that approximates the input/output behavior of the model itself. In other words, it is the concept of "modeling a model" (hence the name "metamodel"). The idea is that, although computer models may be a very complex series of equations that can take a long time to solve, they can always be regarded as a function of their inputs ''Y'' = ''f''(''X''). By running the model at a number of points in the input space, it may be possible to fit a much simpler emulator ''η''(''X''), such that ''η''(''X'') ≈ ''f''(''X'') to within an acceptable margin of error. Then, sensitivity measures can be calculated from the emulator (either with Monte Carlo or analytically), which will have a negligible additional computational cost. Importantly, the number of model runs required to fit the emulator can be orders of magnitude less than the number of runs required to directly estimate the sensitivity measures from the model.

Clearly, the crux of an emulator approach is to find an ''η'' (emulator) that is a sufficiently close approximation to the model ''f''. This requires the following steps:
# Sampling (running) the model at a number of points in its input space. This requires a sample design.
# Selecting a type of emulator (mathematical function) to use.
# "Training" the emulator using the sample data from the model – this generally involves adjusting the emulator parameters until the emulator mimics the true model as well as possible.
Sampling the model can often be done with low-discrepancy sequences, such as the Sobol sequence – due to mathematician Ilya M. Sobol – or Latin hypercube sampling, although random designs can also be used, at the loss of some efficiency. The selection of the emulator type and the training are intrinsically linked since the training method will be dependent on the class of emulator. Some types of emulators that have been used successfully for sensitivity analysis include:
* Gaussian processes (also known as kriging), where any combination of output points is assumed to be distributed as a multivariate Gaussian distribution. Recently, "treed" Gaussian processes have been used to deal with heteroscedastic and discontinuous responses.
* Random forests, in which a large number of decision trees are trained, and the result averaged.
* Gradient boosting, where a succession of simple regressions are used to weight data points to sequentially reduce error.
* Polynomial chaos expansions, which use orthogonal polynomials to approximate the response surface.
* Smoothing splines, normally used in conjunction with HDMR truncations (see below).
* Discrete Bayesian networks, in conjunction with canonical models such as noisy models. Noisy models exploit information on the conditional independence between variables to significantly reduce dimensionality.
The use of an emulator introduces a machine learning problem, which can be difficult if the response of the model is highly nonlinear. In all cases, it is useful to check the accuracy of the emulator, for example using cross-validation.
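A sketch of the emulator workflow follows, assuming scikit-learn is available for the Gaussian-process regression: a (hypothetical) expensive model is run at a modest number of design points, a GP emulator is trained on those runs, and variance-based measures are then estimated cheaply by Monte Carlo on the emulator rather than on the model. The pick-freeze estimator at the end matches the one shown in the variance-based section.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(6)

def expensive_model(X):
    # stand-in for a slow simulator
    return np.sin(X[:, 0]) + 2.0 * X[:, 1] ** 2 + 0.1 * X[:, 0] * X[:, 2]

# 1. run the real model at a small number of design points
k, n_design = 3, 60
X_design = rng.uniform(0.0, 1.0, size=(n_design, k))
y_design = expensive_model(X_design)

# 2-3. choose and train the emulator (a Gaussian process here)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), normalize_y=True)
gp.fit(X_design, y_design)

# cheap Monte Carlo sensitivity estimation on the emulator (pick-freeze, first order)
n = 20000
A = rng.uniform(0.0, 1.0, size=(n, k))
B = rng.uniform(0.0, 1.0, size=(n, k))
fA, fB = gp.predict(A), gp.predict(B)
var_y = np.var(np.concatenate([fA, fB]))
for i in range(k):
    ABi = A.copy()
    ABi[:, i] = B[:, i]
    S_i = np.mean(fB * (gp.predict(ABi) - fA)) / var_y
    print(f"X{i + 1}: estimated first-order index = {S_i:.3f}")
</syntaxhighlight>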


High-dimensional model representations (HDMR)

A high-dimensional model representation (HDMR) (the term is due to H. Rabitz) is essentially an emulator approach, which involves decomposing the function output into a linear combination of input terms and interactions of increasing dimensionality. The HDMR approach exploits the fact that the model can usually be well-approximated by neglecting higher-order interactions (second or third-order and above). The terms in the truncated series can then each be approximated by e.g. polynomials or splines, and the response expressed as the sum of the main effects and interactions up to the truncation order. From this perspective, HDMRs can be seen as emulators which neglect high-order interactions; the advantage is that they are able to emulate models with higher dimensionality than full-order emulators.
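The following is a sketch of the idea behind a first-order HDMR truncation, under the assumption of independent inputs: the output is approximated as a constant plus one univariate component function per input, each fitted here by a low-order polynomial regression of the centred output on that input. Real HDMR implementations use more careful basis functions (orthogonal polynomials, splines) and can include selected second-order terms.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(7)

def model(X):
    # hypothetical model dominated by main effects
    return np.sin(np.pi * X[:, 0]) + 2.0 * X[:, 1] ** 2 + 0.2 * X[:, 2]

n, k, degree = 4000, 3, 3
X = rng.uniform(0.0, 1.0, size=(n, k))
Y = model(X)

f0 = Y.mean()                          # zeroth-order term
residual = Y - f0
components = []
for i in range(k):
    # univariate component f_i(x_i), fitted as a low-order polynomial
    components.append(np.polyfit(X[:, i], residual, degree))

# evaluate the truncated (first-order) HDMR approximation
Y_hdmr = f0 + sum(np.polyval(c, X[:, i]) for i, c in enumerate(components))
r2 = 1.0 - np.var(Y - Y_hdmr) / np.var(Y)
print(f"variance explained by first-order HDMR: {r2:.3f}")
</syntaxhighlight>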


Fourier amplitude sensitivity test (FAST)

The Fourier amplitude sensitivity test (FAST) uses the Fourier series to represent a multivariate function (the model) in the frequency domain, using a single frequency variable. Therefore, the integrals required to calculate sensitivity indices become univariate, resulting in computational savings.
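A minimal, illustrative sketch of the FAST idea (not a production implementation): each input is driven by its own integer frequency along a single search curve, and the first-order sensitivity of an input is read off from the spectral power at (harmonics of) that frequency. The model, the frequency set and the number of harmonics are assumptions; proper FAST implementations select interference-free frequency sets.

<syntaxhighlight lang="python">
import numpy as np

def model(x1, x2, x3):
    # hypothetical test model
    return x1 + 4.0 * x2 ** 2 + 0.5 * x3

freqs = [11, 35, 73]          # assumed illustrative frequencies (one per input)
M = 4                         # number of harmonics retained per input
N = 2 ** 12                   # number of points along the search curve
s = 2.0 * np.pi * np.arange(N) / N

# search-curve transformation mapping s to inputs in [0, 1]
X = [0.5 + np.arcsin(np.sin(w * s)) / np.pi for w in freqs]
y = model(*X)

# one-sided spectrum of the model output along the curve
Y = np.fft.rfft(y - y.mean())
power = np.abs(Y) ** 2
total = power[1:].sum()

for i, w in enumerate(freqs):
    partial = sum(power[p * w] for p in range(1, M + 1))
    print(f"S_{i + 1} ~ {partial / total:.3f}")
</syntaxhighlight>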


Other

Monte Carlo filtering is also a sampling-based approach, in which the objective is to identify regions in the space of the input factors corresponding to particular values (e.g. high or low) of the output; a minimal sketch is given below.
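The sketch below illustrates Monte Carlo filtering in the spirit of regional sensitivity analysis: the input sample is split into a "behavioural" set (runs producing, say, high output values) and a "non-behavioural" set, and for each input the two conditional distributions are compared with a two-sample Kolmogorov–Smirnov statistic (here via scipy, an assumed dependency). Inputs whose distributions differ most between the two sets are the ones driving the output into the region of interest. The model and threshold are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(8)

def model(X):
    # hypothetical model: the second input mostly decides whether the output is "high"
    return X[:, 0] + 4.0 * X[:, 1] ** 2 + 0.1 * X[:, 2]

n, k = 10000, 3
X = rng.uniform(0.0, 1.0, size=(n, k))
Y = model(X)

threshold = np.quantile(Y, 0.9)       # "behavioural" region: top 10% of outputs
behavioural = Y >= threshold

for i in range(k):
    stat, pval = ks_2samp(X[behavioural, i], X[~behavioural, i])
    print(f"X{i + 1}: KS statistic = {stat:.3f} (p = {pval:.2g})")
</syntaxhighlight>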


Applications

Examples of sensitivity analyses can be found in various areas of application, such as:
* Environmental sciences
* Business
* Social sciences
* Chemistry
* Engineering
* Epidemiology
* Meta-analysis
* Multi-criteria decision making
* Time-critical decision making
* Model calibration
* Uncertainty quantification


Sensitivity auditing

It may happen that a sensitivity analysis of a model-based study is meant to underpin an inference, and to certify its robustness, in a context where the inference feeds into a policy or decision-making process. In these cases the framing of the analysis itself, its institutional context, and the motivations of its author may become a matter of great importance, and a pure sensitivity analysis – with its emphasis on parametric uncertainty – may be seen as insufficient. The emphasis on the framing may derive, inter alia, from the relevance of the policy study to different constituencies that are characterized by different norms and values, and hence by a different story about 'what the problem is' and foremost about 'who is telling the story'. Most often the framing includes more or less implicit assumptions, which could range from political (e.g. which group needs to be protected) all the way to technical (e.g. which variable can be treated as a constant).

In order to take these concerns into due consideration, the instruments of sensitivity analysis have been extended to provide an assessment of the entire knowledge and model-generating process. This approach has been called 'sensitivity auditing'. It takes inspiration from NUSAP, a method used to qualify the worth of quantitative information with the generation of 'pedigrees' of numbers. Likewise, sensitivity auditing has been developed to provide pedigrees of models and model-based inferences. Sensitivity auditing has been especially designed for an adversarial context, where not only the nature of the evidence, but also the degree of certainty and uncertainty associated with the evidence, will be the subject of partisan interests. Sensitivity auditing is recommended in the European Commission guidelines for impact assessment, as well as in the report Science Advice for Policy by European Academies (Science Advice for Policy by European Academies, ''Making sense of science for policy under conditions of complexity and uncertainty'', Berlin, 2019).


Related concepts

Sensitivity analysis is closely related to uncertainty analysis; while the latter studies the overall uncertainty in the conclusions of the study, sensitivity analysis tries to identify which source of uncertainty weighs more on the study's conclusions. The problem setting in sensitivity analysis also has strong similarities with the field of design of experiments (Box GEP, Hunter WG, Hunter JS, ''Statistics for Experimenters''. New York: Wiley & Sons). In a design of experiments, one studies the effect of some process or intervention (the 'treatment') on some objects (the 'experimental units'). In sensitivity analysis one looks at the effect of varying the inputs of a mathematical model on the output of the model itself. In both disciplines one strives to obtain information from the system with a minimum of physical or numerical experiments.


See also

* Causality
* Elementary effects method
* Experimental uncertainty analysis
* Fourier amplitude sensitivity testing
* Info-gap decision theory
* Interval FEM
* Perturbation analysis
* Probabilistic design
* Probability bounds analysis
* Robustification
* ROC curve
* Uncertainty quantification
* Variance-based sensitivity analysis


References


Further reading

* Fassò A. (2007) "Statistical sensitivity analysis and water quality". In Wymer L. (ed.), ''Statistical Framework for Water Quality Criteria and Monitoring''. Wiley, New York.
* Fassò A., Perri P.F. (2002) "Sensitivity Analysis". In Abdel H. El-Shaarawi and Walter W. Piegorsch (eds), ''Encyclopedia of Environmetrics'', Volume 4, pp. 1968–1982, Wiley.
* Fassò A., Esposito E., Porcu E., Reverberi A.P., Vegliò F. (2003) "Statistical Sensitivity Analysis of Packed Column Reactors for Contaminated Wastewater". ''Environmetrics'', Vol. 14, n. 8, 743–759.
* Haug, Edward J.; Choi, Kyung K.; Komkov, Vadim (1986) ''Design Sensitivity Analysis of Structural Systems''. Mathematics in Science and Engineering, 177. Academic Press, Inc., Orlando, FL.
* Pilkey, O. H. and L. Pilkey-Jarvis (2007), ''Useless Arithmetic. Why Environmental Scientists Can't Predict the Future''. New York: Columbia University Press.
* Santner, T. J.; Williams, B. J.; Notz, W. I. (2003) ''Design and Analysis of Computer Experiments''. Springer-Verlag.
* Taleb, N. N. (2007) ''The Black Swan: The Impact of the Highly Improbable''. Random House.


Special issues


* ''Reliability Engineering & System Safety'', 2003, 79:121–2: SAMO 2001: Methodological advances and innovative applications of sensitivity analysis, edited by Tarantola S, Saltelli.
* ''Reliability Engineering & System Safety'', Volume 91, 2006, Special issue on sensitivity analysis, edited by Helton JC, Cooke RM, McKay MD, Saltelli.
* ''International Journal of Chemical Kinetics'', 2008, Volume 40, Issue 11 – Special Issue on Sensitivity Analysis, edited by Turányi T.
* ''Reliability Engineering & System Safety'', Volume 94, Issue 7, pp. 1133–1244 (July 2009), Special Issue on Sensitivity Analysis, edited by Andrea Saltelli.
* ''Reliability Engineering & System Safety'', Volume 107, November 2012, Advances in sensitivity analysis, SAMO 2010, edited by Borgonovo E, Tarantola S.
* ''Journal of Statistical Computation and Simulation'', Volume 85, 2015, Issue 7: Special Issue: Selected Papers from the 7th International Conference on Sensitivity Analysis of Model Output, July 2013, Nice, France, edited by David Ginsbourger, Bertrand Iooss & Luc Pronzato.
* ''Reliability Engineering & System Safety'', Volume 134, February 2015, edited by Thierry A. Mara and Stefano Tarantola.
* ''Reliability Engineering & System Safety'', Volume 187, July 2019, edited by Stefano Tarantola and Nathalie Saint-Geours.
* ''Reliability Engineering & System Safety'', Volume 212, August 2021, edited by Bertrand Iooss, Bruno Sudret, Samuele Lo Piano and Clémentine Prieur.
* ''Environmental Modelling & Software'', Special issue: Sensitivity analysis for environmental modelling (2021), edited by Saman Razavi, Andrea Saltelli, Tony Jakeman, Qiongli Wu.


External links

* Joseph Hart, Julie Bessac, Emil Constantinescu (2018), "Global sensitivity analysis for statistical model parameters"
* Web page on sensitivity analysis (Joint Research Centre of the European Commission)
* SimLab, the free software for global sensitivity analysis of the Joint Research Centre
* Extensive resources for uncertainty and sensitivity analysis of computationally demanding models (archived: https://web.archive.org/web/20130424121555/http://www.mucm.ac.uk/index.html)